## [1] "Excluded 31 participants based on catch-trial performance."
We further exclude participants whose ratings appear to be independent of the scene they are seeing. To quantify this, we compute each participant's mean rating for each utterance across all trials and then correlate the participant's actual ratings with these per-utterance means. A participant who attends to the scene should give different ratings to the same utterance across scenes, so this correlation should be modest. A high correlation is unexpected: it indicates that the participant gave roughly the same rating to an utterance regardless of the scene, i.e., chose ratings at random with respect to the scene. We therefore also exclude the data from participants for whom this correlation is larger than 0.75.
## [1] "Excluded 1 participants based on random responses."
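The exclusion criterion above can be sketched in base R as follows. This is a minimal illustration on made-up data, not the code that produced the output above: the data frame `d`, its column names, and the two toy participants are assumptions; only the 0.75 cutoff comes from the text.

```r
# Hypothetical long-format data: one row per participant x utterance x scene.
d <- expand.grid(participant = c("p1", "p2"),
                 utterance = c("might", "probably"),
                 scene = seq(10, 100, by = 10),
                 stringsAsFactors = FALSE)

# p1 tracks the scene; p2 gives almost the same rating per utterance
# no matter which scene is shown.
d$rating <- ifelse(d$participant == "p1",
                   ifelse(d$utterance == "might", 100 - d$scene, d$scene),
                   ifelse(d$utterance == "might", 30, 70) + (d$scene %% 20) / 10)

# Mean rating per participant and utterance, aligned with the rows of d.
d$mean_rating <- ave(d$rating, d$participant, d$utterance)

# Correlation between actual ratings and per-utterance means, per participant.
r <- sapply(split(d, d$participant),
            function(x) cor(x$rating, x$mean_rating))

threshold <- 0.75  # cutoff stated in the text
excluded <- names(r)[r > threshold]
d_clean <- d[!d$participant %in% excluded, ]
```

In this toy example only `p2` exceeds the cutoff, because that participant's ratings are almost fully determined by the utterance alone.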
We compute the area under the curve (AUC) directly, using the AUC function with the spline method.
## [1] -15.673873 -7.634082 27.689755 39.061802 28.342940 24.839464
## [7] -3.903503 48.438740 34.221671 35.007151 42.933573 12.497181
## [13] 14.625947 15.838332 25.898586 -4.608395 7.694485 24.713868
## [19] 13.918289 18.356880 7.750705 14.625947 50.563930 1.911270
## [25] 4.667133 22.560162 12.927865 33.262202 34.221671 51.030906
## [31] 27.689755 -31.089861 67.664372 18.027560 65.640823 -38.306152
## [37] 34.221671 -2.754073 -5.656658 -12.434285 48.337958 12.300371
## [43] 3.259558 30.793440 -1.506618 19.851896 86.062019 36.106809
## [49] 1.796936 -2.209590 -0.298197 13.467697 17.269121 57.368724
## [55] 61.505673 32.070449 17.833441 -9.585335 -9.244998 30.517922
## [61] 24.162369 6.955771 -20.019694 34.221671 38.453389 29.339265
## [67] -34.846237 12.615582 26.187437 1.748617 42.418521 -46.613823
## [1] -0.04258585 15.85585453 34.22167115 -60.66013011 34.22167115
## [6] 6.83561287 -24.31732431 2.75461904 43.39936050 -42.74204299
## [11] -41.09733902 15.92309376 4.04704308 16.42078182 21.28083703
## [16] 5.68868015 39.97115108 -2.35709708 6.65250894 -7.87378669
## [21] 27.68975489 6.53177206 15.82300178 0.93133264 12.63474667
## [26] 13.62088092 1.65556374 0.91110730 -34.22167115 8.97860471
## [31] 18.82493447 -24.57437786 -0.32704894 11.34579805 57.78603952
## [36] -24.61519416 16.05335923 -27.86721096 -3.26253625 -27.47910903
## [41] 1.00971494 25.81779479 -5.20972611 5.42697871 -12.67403905
## [46] 16.76981730 -54.34819214 31.66801769 -2.40390717 -13.39404036
## [51] -4.12129641 -4.01584329 0.59617160 57.78603952 62.21260505
## [56] 38.64821711 -13.71704203 -68.24809490 -8.28171618 37.18261243
## [61] 27.35622313 -1.77413452 -22.23299260 34.22167115 13.85089559
## [66] 34.22167115 7.23460603 6.08361631 22.31985215 41.46056847
## [71] 45.00124944 -38.72330819
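As an illustration of the spline-based AUC computation mentioned above, the following sketch integrates a natural cubic spline through the data points using only base R. The source does not name the package that provides the AUC function (presumably DescTools), so the helper `auc_spline`, the example curves, and the way the difference is formed are all assumptions made for this example.

```r
# Area under a curve via a natural cubic spline through the points (x, y).
# Hypothetical helper; it may differ in detail from the AUC function used above.
auc_spline <- function(x, y) {
  f <- splinefun(x, y, method = "natural")
  integrate(f, lower = min(x), upper = max(x))$value
}

# Made-up rating curves for two utterances over five scenes.
x <- seq(0, 100, by = 25)
y_might    <- c(10, 40, 60, 40, 10)
y_probably <- c(0, 10, 30, 60, 90)

# A per-participant AUC difference could then be formed like this.
auc_diff <- auc_spline(x, y_might) - auc_spline(x, y_probably)
```

Integrating the fitted spline rather than summing trapezoids gives a smooth interpolation between the sampled scenes, at the cost of possible overshoot when ratings change abruptly.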
t-test and regression model with control variables:
##
## Two Sample t-test
##
## data: aucs.cautious$auc_diff and aucs.confident$auc_diff
## t = 2.9276, df = 142, p-value = 0.00398
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 4.153181 21.423418
## sample estimates:
## mean of x mean of y
##  18.04311   5.25481
##
## Cohen's d
##
## d estimate: 0.487931 (small)
## 95 percent confidence interval:
##    lower    upper
## 0.153596 0.822266
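The t-test and effect size reported above can be reproduced in outline with base R. The sketch below uses simulated stand-in values, not the study data; the group means and standard deviation are arbitrary choices, and Cohen's d is computed by hand from the pooled standard deviation rather than with the package that printed the output above.

```r
set.seed(42)
# Simulated stand-ins for the two conditions (72 values each); not the study data.
cautious  <- rnorm(72, mean = 18, sd = 26)
confident <- rnorm(72, mean = 5,  sd = 26)

# Student's two-sample t-test (equal variances), so df = n1 + n2 - 2.
tt <- t.test(cautious, confident, var.equal = TRUE)

# Cohen's d based on the pooled standard deviation.
n1 <- length(cautious)
n2 <- length(confident)
sd_pooled <- sqrt(((n1 - 1) * var(cautious) + (n2 - 1) * var(confident)) /
                    (n1 + n2 - 2))
d <- (mean(cautious) - mean(confident)) / sd_pooled

# For the pooled test, t and d are linked by t = d / sqrt(1/n1 + 1/n2).
t_from_d <- d / sqrt(1 / n1 + 1 / n2)
```

Applying the same relation to the reported values, t = 2.9276 with 72 values per group gives d of about 0.488, consistent with the estimate printed above.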
##
## Call:
## lm(formula = auc_diff ~ cond + test_order + first_speaker_type +
## confident_speaker, data = rbind(aucs.cautious, aucs.confident))
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -75.676 -14.133   0.127  16.111  72.716
##
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)                        15.558      4.876   3.191  0.00175 **
## condconfident (probably-biased)   -12.788      4.384  -2.917  0.00412 **
## test_orderreverse                  -2.212      4.390  -0.504  0.61504
## first_speaker_typeconfidentfirst    5.686      4.393   1.294  0.19771
## confident_speakerconfidentm         1.185      4.393   0.270  0.78771
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 26.3 on 139 degrees of freedom
## Multiple R-squared: 0.07014, Adjusted R-squared: 0.04338
## F-statistic: 2.621 on 4 and 139 DF, p-value: 0.0375
## Analysis of Variance Table
##
## Model 1: auc_diff ~ cond
## Model 2: auc_diff ~ cond + test_order + first_speaker_type + confident_speaker
##   Res.Df   RSS Df Sum of Sq      F Pr(>F)
## 1    142 97543
## 2    139 96176  3    1367.1 0.6586 0.5789
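The nested model comparison above can be sketched as follows. The data are simulated and the factor levels are placeholders chosen for this example; only the column names and the two model formulas are taken from the output above.

```r
set.seed(7)
n <- 144
# Simulated stand-in data; factor levels are hypothetical placeholders.
sim <- data.frame(
  cond               = factor(rep(c("cautious", "confident"), each = n / 2)),
  test_order         = factor(rep(c("same", "reverse"), times = n / 2)),
  first_speaker_type = factor(rep(c("a", "b"), each = 2, length.out = n)),
  confident_speaker  = factor(rep(c("f", "m"), each = 4, length.out = n))
)
sim$auc_diff <- ifelse(sim$cond == "cautious", 18, 5) + rnorm(n, sd = 26)

# Model with condition only, and model with the three control variables added.
m_base <- lm(auc_diff ~ cond, data = sim)
m_full <- lm(auc_diff ~ cond + test_order + first_speaker_type +
               confident_speaker, data = sim)

# F-test for whether the three control variables improve the fit.
cmp <- anova(m_base, m_full)

# The same F statistic by hand, from the residual sums of squares.
rss_base <- deviance(m_base)
rss_full <- deviance(m_full)
f_manual <- ((rss_base - rss_full) / 3) / (rss_full / df.residual(m_full))
```

Plugging the reported values into the same formula gives (1367.1 / 3) / (96176 / 139), which is approximately 0.6586 and matches the F statistic in the table above.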